{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# PA_DPA_3-AES_DPA_Attack\n",
    "\n",
    "Supported setups:\n",
    "\n",
    "SCOPES:\n",
    "\n",
    "* OPENADC\n",
    "* CWNANO\n",
    "\n",
    "PLATFORMS:\n",
    "\n",
    "* CWLITEARM\n",
    "* CWLITEXMEGA - **NOTE: this attack is less reliable on the CWLITEXMEGA than other targets**\n",
    "* CWNANO"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before starting this tutorial, it's recommended that you first complete the earlier PA_DPA tutorials since these will familiarize you with the concept of Differental Power Analysis. With that out of the way, let's look at how this attack works."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## DPA Attack Theory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we explored in the earlier DPA tutorials, the Hamming Weight of the result of the SBox operation in AES has a measurable effect on the power consumed by the microcontroller. It turns out that just this effect (and not anything stronger, such as its linearity) is enough information to break an AES key. There's a few different ways we could go about this, but for this tutorial, we'll be looking at difference of means. With this technique, the goal is to separate the traces by a bit in the result of the SBox output (it doesn't matter which one): if that bit is 1, its group of traces should, on average, have higher power consumption during the SBox operation than the other set. \n",
    "\n",
    "Whether or not we get a large difference in the means between these two groups depends on whether they were properly sorted into these groups. If not, there should be, on average, little difference between the two and therefore a low difference of means. Recall the SBox operation:\n",
    "\n",
    "![title](https://wiki.newae.com/images/7/71/Sbox_cpa_detail.png)\n",
    "\n",
    "The SBox output depends on the subkey, which we don't know (and the plaintext, which we do). However, since there's a large difference of means for the correct key and small ones for the rest of the possible subkeys, we have a method of checking whether a given subkey is correct. If we calculate the difference of means for each subkey, the correct one will have the largest difference of means.\n",
    "\n",
    "Our plan looks as follows\n",
    "1. Capture a bunch of power traces with varying plaintext\n",
    "1. Group each trace by the value of their SBox output's lowest bit for a given key guess\n",
    "1. Calculate the difference of means\n",
    "1. Repeat for each possible subkey\n",
    "1. Select the largest difference of means -> this is the correct subkey\n",
    "1. Repeat for each subkey in the key\n",
    "\n",
    "At the end, we should get a correct AES key!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "SCOPETYPE = 'OPENADC'\n",
    "PLATFORM = 'CWLITEARM'\n",
    "CRYPTO_TARGET = 'TINYAES128C'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash -s \"$PLATFORM\" \"$CRYPTO_TARGET\"\n",
    "cd ../hardware/victims/firmware/simpleserial-aes\n",
    "make PLATFORM=$1 CRYPTO_TARGET=$2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Capturing Power Traces"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Capture and setup is similar to earlier tutorials. We'll have to capture a fair number of traces (usually a few thousand here) since Difference of Means isn't a super trace efficient method. As you'll find during the CPA tutorials, CPA is much better in this regard - it can often break AES implementations such as these in under 50 traces.\n",
    "\n",
    "You may also find that you need to modify gain settings and the number of traces you capture - this attack is much more sensitive to gain settings and noise than a CPA attack would be. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%run \"Helper_Scripts/Setup_Generic.ipynb\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fw_path = \"../hardware/victims/firmware/simpleserial-aes/simpleserial-aes-{}.hex\".format(PLATFORM)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cw.program_target(scope, prog, fw_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Capture"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Capture Traces\n",
    "from tqdm import tnrange\n",
    "import numpy as np\n",
    "import time\n",
    "\n",
    "ktp = cw.ktp.Basic()\n",
    "\n",
    "traces = []\n",
    "N = 2500  # Number of traces\n",
    "\n",
    "if PLATFORM == \"CWLITEARM\" or PLATFORM == \"CW308_STM32F3\":\n",
    "    scope.adc.samples = 4000\n",
    "elif PLATFORM == \"CWLITEXMEGA\" or PLATFORM == \"CW303\":\n",
    "    scope.gain.db = 34 #works best with this gain for some reason\n",
    "    scope.adc.samples = 1700 - 170\n",
    "    scope.adc.offset = 500 + 700 + 170\n",
    "    N = 5000\n",
    "    \n",
    "print(scope)\n",
    "for i in tnrange(N, desc='Capturing traces'):\n",
    "    key, text = ktp.next()  # manual creation of a key, text pair can be substituted here\n",
    "\n",
    "    trace = cw.capture_trace(scope, target, text, key)\n",
    "    if trace is None:\n",
    "        continue\n",
    "    traces.append(trace)\n",
    "\n",
    "#Convert traces to numpy arrays\n",
    "trace_array = np.asarray([trace.wave for trace in traces])\n",
    "textin_array = np.asarray([trace.textin for trace in traces])\n",
    "known_keys = np.asarray([trace.key for trace in traces])  # for fixed key, these keys are all the same"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we discussed above, our goal here is to find the biggest difference of means out of the possible subkey values we could have. First, we'll get some values and functions that will be useful for our calculations. We'll be using `intermediate()` later to get the output of the SBox from a plaintext and key input."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "numtraces = np.shape(trace_array)[0] #total number of traces\n",
    "numpoints = np.shape(trace_array)[1] #samples per trace\n",
    "\n",
    "sbox = (\n",
    "    0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76,\n",
    "    0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0,\n",
    "    0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15,\n",
    "    0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75,\n",
    "    0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84,\n",
    "    0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf,\n",
    "    0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8,\n",
    "    0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2,\n",
    "    0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73,\n",
    "    0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb,\n",
    "    0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79,\n",
    "    0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08,\n",
    "    0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a,\n",
    "    0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e,\n",
    "    0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf,\n",
    "    0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16)\n",
    "\n",
    "def intermediate(pt, keyguess):\n",
    "    return sbox[pt ^ keyguess]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our first step here will be separating our traces into different groups based on the SBox's output. As mentioned earlier, we're separating based on the least significant bit, but really any bit would work (as a test, you can change this and see if the attack still works):\n",
    "\n",
    "```Python\n",
    "one_list = []\n",
    "zero_list = []\n",
    "for tnum in range(numtraces):\n",
    "    if (intermediate(textin_array[tnum][subkey], kguess) & 1):\n",
    "        one_list.append(trace_array[tnum])\n",
    "    else:\n",
    "        zero_list.append(trace_array[tnum])\n",
    "```\n",
    "\n",
    "Then calculate the difference of means:\n",
    "\n",
    "```Python\n",
    "one_avg = np.asarray(one_list).mean(axis=0)\n",
    "zero_avg = np.asarray(zero_list).mean(axis=0)\n",
    "mean_diffs[kguess] = np.max(abs(one_avg - zero_avg))\n",
    "```\n",
    "\n",
    "We'll need to repeat this with each possible key guess and then pick the one with the highest difference of means:\n",
    "```Python\n",
    "guess = np.argsort(mean_diffs)[-1]\n",
    "key_guess.append(guess)\n",
    "print(hex(guess))\n",
    "print(mean_diffs[guess])\n",
    "```\n",
    "\n",
    "Finally, altogether and attacking all of the subkeys:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tnrange\n",
    "import numpy as np\n",
    "mean_diffs = np.zeros(255)\n",
    "key_guess = []\n",
    "known_key = known_keys[0]\n",
    "plots = []\n",
    "for subkey in tnrange(0, 16, desc=\"Attacking Subkey\"):\n",
    "    for kguess in tnrange(255, desc=\"Keyguess\", leave=False):\n",
    "        one_list = []\n",
    "        zero_list = []\n",
    "        \n",
    "        for tnum in range(numtraces):\n",
    "            if (intermediate(textin_array[tnum][subkey], kguess) & 1): #LSB is 1\n",
    "                one_list.append(trace_array[tnum])\n",
    "            else:\n",
    "                zero_list.append(trace_array[tnum])\n",
    "        one_avg = np.asarray(one_list).mean(axis=0)\n",
    "        zero_avg = np.asarray(zero_list).mean(axis=0)\n",
    "        mean_diffs[kguess] = np.max(abs(one_avg - zero_avg))\n",
    "        if kguess == known_key[subkey]:\n",
    "            plots.append(abs(one_avg - zero_avg))\n",
    "    guess = np.argsort(mean_diffs)[-1]\n",
    "    key_guess.append(guess)\n",
    "    print(hex(guess) + \"(real = 0x{:02X})\".format(known_key[subkey]))\n",
    "    #mean_diffs.sort()\n",
    "    print(mean_diffs[guess])\n",
    "    print(mean_diffs[known_key[subkey]])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With that done, we should now have the correct key:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(key_guess)\n",
    "print(known_key)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also plot the difference of means for a few of the correct subkey bytes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from bokeh.plotting import figure, show\n",
    "from bokeh.io import output_notebook\n",
    "\n",
    "output_notebook()\n",
    "p = figure()\n",
    "p.line(range(numpoints), plots[0], line_color='green')\n",
    "p.line(range(numpoints), plots[1], line_color='red')\n",
    "p.line(range(numpoints), plots[15], line_color='blue')\n",
    "show(p)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Congratulations, you have (hopefully) broken AES using a DPA attack! As you might have discovered during this tutorial, there can be quite a few issues with the difference of means method for breaking AES keys:\n",
    "\n",
    "* It's quite susceptible to noise\n",
    "* The attack can easily pick up other parts of the AES operation\n",
    "* The attack typically requires a lot of traces. These software AES implementations are pretty weak against power analysis, but they still required thousands of traces to break\n",
    "* Some targets (such as the XMega) require fine tuning settings to make the attack work\n",
    "\n",
    "Nevertheless, using a difference of means attack can still be very useful. For example, a later tutorial, PA_Multi_1, uses a difference of means attack similar in concept to this one to break the signature of an AES256 bootloader."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tests"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert (known_key == key_guess).all(), \"Failed to break key.\\nGot: {}\\nExp: {}\".format(key_guess, known_key)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
